Exact Inference on Manycore Processors Using Pointer Jumping
Abstract
Exact inference is a key problem in exploring probabilistic graphical models. Most parallel algorithms for exact inference exploit data and structural parallelism. These algorithms deliver limited performance if the input model offers limited data and structural parallelism. In this paper, we study a pointer-jumping-based method on manycore systems for exact inference in junction trees. We adapt the technique for both evidence collection and evidence distribution so as to efficiently process junction trees with multiple evidence cliques. We also study the impact of junction tree topology on evidence collection. We implement the proposed method on state-of-the-art manycore systems. Experimental results show that, for junction trees with limited data and structural parallelism, pointer jumping is well suited to accelerate exact inference on manycore systems.
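The abstract does not spell out the pointer jumping technique itself, so the following is only a minimal sketch of the generic idea, not the paper's implementation. It assumes a rooted tree given as a parent array (the root's parent is `None`) and an associative `combine` operator; the function name `pointer_jump` and the use of integer addition in place of real potential-table operations are illustrative assumptions. In each synchronous round, every node absorbs the partial result of the node it points to and then points to that node's target, so a chain of n nodes is processed in about log2(n) rounds instead of n - 1 sequential steps.

```python
def pointer_jump(parent, value, combine):
    """Combine each node's value with the values of all its ancestors.

    parent[i] is the parent of node i, or None for the root.
    combine must be associative (here a stand-in for the message
    updates used in evidence propagation; this is an assumption).
    Returns the per-node results and the number of rounds used.
    """
    n = len(parent)
    ptr = list(parent)   # current jump pointer of each node
    acc = list(value)    # partial result accumulated so far
    rounds = 0
    while any(p is not None for p in ptr):
        # Synchronous step: read the old arrays, write new ones,
        # mimicking what parallel threads would do in lockstep.
        new_acc = list(acc)
        new_ptr = list(ptr)
        for i in range(n):  # each iteration is independent
            p = ptr[i]
            if p is not None:
                new_acc[i] = combine(acc[i], acc[p])
                new_ptr[i] = ptr[p]  # jump over the absorbed node
        acc, ptr = new_acc, new_ptr
        rounds += 1
    return acc, rounds
```

For example, on a chain of eight nodes with `parent = [None, 0, 1, 2, 3, 4, 5, 6]`, all values equal to 1, and addition as `combine`, each node ends up with the number of nodes on its path to the root, and the loop finishes in three rounds. The inner `for` loop has no dependencies between iterations, which is what makes the scheme attractive when the tree itself offers little structural parallelism.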
Similar Resources
Parallel Exact Inference
In this paper, we present a complete message-passing implementation that shows scalable performance while performing exact inference on arbitrary Bayesian networks. Our work is based on a parallel version of the classical technique of converting a Bayesian network to a junction tree before computing inference. We propose a parallel algorithm for constructing potential tables for a junction tree a...
Parallel Implementation of Borůvka's Minimum Spanning Tree Algorithm
We study parallel algorithms for the minimum spanning tree problem, based on the sequential algorithm of Borůvka. The target architectures for our algorithm are asynchronous, distributed-memory machines. Analysis of our parallel algorithm, on a simple model that is reminiscent of the LogP model, shows that in principle a speedup proportional to the number of processors can be achieved, but that...
FPGA Prototyping of Manycore Multinode Systems for Irregular Applications
Knowledge discovery applications are an emerging class of irregular applications that exploit graph-based data structures, present poor locality, and analyze very large data sets that require multi-node systems for processing. Current commodity clusters, which exploit cache-based processors, usually perform poorly with these applications. To address their requirements, full-custom machines, like th...
A fast parallel algorithm to determine edit distance
We consider the problem of determining in parallel the cost of converting a source string to a destination string by a sequence of insert, delete and transform operations. Each operation has an integer cost in some fixed range. We present an algorithm that runs in O(log m log n) time and uses mn processors on a CRCW PRAM, where m and n are the lengths of the strings. The best known sequential al...
Parallel Solutions of Simple Index Recurrence Equations
We define a new type of recurrence equation called "Simple Indexed Recurrences" (SIR). In this type of equation, ordinary recurrences of the form X[i] = op_i(X[i-1], X[i]), i = 1, ..., n, are generalized to X[g(i)] = op_i(X[f(i)], X[g(i)]), where f, g : {1, ..., n} → {1, ..., m} and g is distinct. This enables us to model sequential loops of the form for i = 1 to n: X[g(i)] = op_i(X[f(i)], X[g(i)]); as a seq...